Bias Detection
Bridging Human and Model Perspectives: A Comparative Analysis of Political Bias Detection in News Media Using Large Language Models
Banik, Shreya Adrita, Rahman, Niaz Nafi, Moiukh, Tahsina, Sadeque, Farig
Detecting political bias in news media is a complex task that requires interpreting subtle linguistic and contextual cues. Although recent advances in Natural Language Processing (NLP) have enabled automatic bias classification, the extent to which large language models (LLMs) align with human judgment remains underexplored. This study presents a comparative framework for evaluating political bias detection across human annotations and multiple LLMs, including GPT, BERT, RoBERTa, and FLAN. We construct a manually annotated dataset of news articles and assess annotation consistency, bias polarity, and inter-model agreement to quantify divergence between human and model perceptions of bias. Experimental results show that generative models such as GPT demonstrate the strongest overall agreement with human annotations in a zero-shot setting, while among the transformer-based baselines our fine-tuned RoBERTa model achieves the highest accuracy and the closest alignment with human-annotated labels. Our findings highlight systematic differences in how humans and LLMs perceive political slant, underscoring the need for hybrid evaluation frameworks that combine human interpretability with model scalability in automated media bias detection.
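The abstract does not spell out which agreement statistics are used; a minimal sketch of one plausible measurement, chance-corrected agreement (Cohen's kappa) between human annotations and a model's zero-shot labels, with hypothetical data and a hypothetical three-way label scheme:

```python
# Minimal sketch: quantify human-model agreement on 3-way political-bias labels.
# Labels and data are hypothetical; the paper's metrics may differ.
from sklearn.metrics import accuracy_score, cohen_kappa_score

human = ["left", "center", "right", "center", "left"]   # human annotations
model = ["left", "center", "right", "left", "center"]   # e.g. zero-shot GPT output

print("accuracy:", accuracy_score(human, model))        # raw label agreement
print("kappa:   ", cohen_kappa_score(human, model))     # chance-corrected agreement
```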
- North America > United States (0.28)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Middle East > Iran (0.14)
- (9 more...)
- Media > News (0.70)
- Information Technology (0.68)
- Government > Regional Government (0.47)
Argumentative Debates for Transparent Bias Detection [Technical Report]
Ayoobi, Hamed, Potyka, Nico, Rapberger, Anna, Toni, Francesca
As the use of AI in society grows, addressing emerging biases is essential to prevent systematic discrimination. Several bias detection methods have been proposed, but, with few exceptions, they tend to ignore transparency. Yet interpretability and explainability are core requirements for algorithmic fairness, even more so than for other algorithmic solutions, given the human-oriented nature of fairness. We present ABIDE (Argumentative BIas detection by DEbate), a novel framework that structures bias detection transparently as a debate, guided by an underlying argument graph as understood in (formal and computational) argumentation. The arguments concern the success chances of groups in local neighbourhoods and the significance of those neighbourhoods. We evaluate ABIDE experimentally and demonstrate its performance strengths against an argumentative baseline.
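As a loose illustration only (not the ABIDE procedure), the sketch below builds neighbourhood-level pro/con arguments about group success rates around sampled query points and aggregates them into a verdict; all data, thresholds, and weights are hypothetical:

```python
# Loose illustration (not the ABIDE algorithm): neighbourhood-level arguments
# for and against the claim "the classifier is biased". All values hypothetical.
import numpy as np
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))                  # feature vectors
group = rng.integers(0, 2, size=200)           # protected attribute (0 / 1)
y = (X[:, 0] + 0.8 * group > 0).astype(int)    # outcomes correlated with group

knn = NearestNeighbors(n_neighbors=25).fit(X)
pro = con = 0.0                                # debate strength for / against bias
for q in rng.choice(200, size=20, replace=False):
    idx = knn.kneighbors(X[q:q + 1], return_distance=False)[0]
    g, s = group[idx], y[idx]
    if g.min() == g.max():                     # only one group present: no argument
        continue
    gap = abs(s[g == 0].mean() - s[g == 1].mean())  # success-chance disparity
    if gap > 0.1:
        pro += gap                             # argument: this neighbourhood shows disparity
    else:
        con += 1.0 - gap                       # argument: groups fare similarly here

print("verdict:", "biased" if pro > con else "not biased", round(pro, 2), round(con, 2))
```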
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Netherlands > Groningen (0.04)
- Europe > Germany > North Rhine-Westphalia > Arnsberg Region > Dortmund (0.04)
Evaluating LLMs for Demographic-Targeted Social Bias Detection: A Comprehensive Benchmark Study
Majumdar, Ayan, Chen, Feihao, Li, Jinghui, Wang, Xiaozhen
Large-scale web-scraped text corpora used to train general-purpose AI models often contain harmful demographic-targeted social biases, creating a regulatory need for data auditing and for scalable bias-detection methods. Although prior work has investigated biases in text datasets and related detection methods, these studies remain narrow in scope. They typically focus on a single content type (e.g., hate speech), cover limited demographic axes, overlook biases affecting multiple demographics simultaneously, and analyze few techniques. Consequently, practitioners lack a holistic understanding of the strengths and limitations of recent large language models (LLMs) for automated bias detection. In this study, we present a comprehensive evaluation framework for English texts to assess the ability of LLMs to detect demographic-targeted social biases. To align with regulatory requirements, we frame bias detection as a multi-label task using a demographic-focused taxonomy. We then conduct a systematic evaluation across model scales and techniques, including prompting, in-context learning, and fine-tuning. Using twelve datasets spanning diverse content types and demographics, our study demonstrates the promise of fine-tuned smaller models for scalable detection. However, our analyses also expose persistent gaps across demographic axes and multi-demographic targeted biases, underscoring the need for more effective and scalable auditing frameworks.
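A minimal sketch of the multi-label framing the study adopts, using the Hugging Face transformers multi-label head; the demographic axes listed are hypothetical stand-ins for the paper's taxonomy:

```python
# Minimal sketch of multi-label bias detection: one sigmoid output per
# demographic axis. Axes are hypothetical; the paper's taxonomy may differ.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

AXES = ["gender", "race", "religion", "age", "disability"]  # hypothetical axes

tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=len(AXES),
    problem_type="multi_label_classification",  # BCE loss, sigmoid per label
)

batch = tok(["example text targeting several groups"], return_tensors="pt")
with torch.no_grad():
    probs = torch.sigmoid(model(**batch).logits)[0]

flagged = [a for a, p in zip(AXES, probs) if p > 0.5]  # multi-demographic output
print(flagged)  # untrained head: outputs are meaningless until fine-tuned
```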
- North America > United States > Oregon (0.04)
- North America > Dominican Republic (0.04)
- Europe > Germany > Saarland > Saarbrücken (0.04)
- (3 more...)
- Law (1.00)
- Government > Regional Government (0.93)
Applications and Challenges of Fairness APIs in Machine Learning Software
Das, Ajoy, Uddin, Gias, Chowdhury, Shaiful, Akhond, Mostafijur Rahman, Hemmati, Hadi
Machine Learning software systems are frequently used in our day-to-day lives, and some are deployed in sensitive environments to make life-changing decisions. It is therefore crucial to ensure that these AI/ML systems do not make discriminatory decisions against specific groups or populations. To that end, various open-source bias detection and mitigation software libraries (aka API libraries) are being developed and used. In this paper, we conduct a qualitative study to understand in what scenarios these open-source fairness APIs are used in the wild, how they are used, and what challenges the developers of these APIs face while developing and adopting these libraries. We analyzed 204 GitHub repositories (from a list of 1885 candidate repositories) that use 13 APIs developed to address bias in ML software. We found that these APIs are used for two primary purposes (i.e., learning and solving real-world problems), targeting 17 unique use-cases. Our study suggests that developers are not well-versed in bias detection and mitigation; they encounter many troubleshooting issues and frequently ask for opinions and resources. Our findings can be instrumental for future bias-related software engineering research and can guide educators in developing more up-to-date curricula.
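For readers unfamiliar with these libraries, a minimal sketch of typical fairness-API usage; Fairlearn is shown as a representative open-source fairness library, without implying it is one of the 13 APIs studied:

```python
# Minimal sketch of typical fairness-API usage in the wild, using Fairlearn
# as a representative example. Data is hypothetical.
import numpy as np
from fairlearn.metrics import demographic_parity_difference

y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])               # ground-truth outcomes
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])               # model decisions
sex = np.array(["f", "f", "f", "f", "m", "m", "m", "m"])  # sensitive attribute

# 0.0 means both groups receive positive decisions at the same rate.
print(demographic_parity_difference(y_true, y_pred, sensitive_features=sex))
```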
- North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.14)
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Greece > Attica > Athens (0.04)
- (23 more...)
- Education > Educational Technology > Educational Software > Computer Based Training (0.91)
- Education > Educational Setting (0.65)
- Information Technology > Software (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.92)
- Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.67)
Stereotype Detection as a Catalyst for Enhanced Bias Detection: A Multi-Task Learning Approach
Tomar, Aditya, Murthy, Rudra, Bhattacharyya, Pushpak
Bias and stereotypes in language models can cause harm, especially in sensitive areas like content moderation and decision-making. This paper addresses bias and stereotype detection by exploring how jointly learning these tasks enhances model performance. We introduce StereoBias, a unique dataset labeled for bias and stereotype detection across five categories (religion, gender, socio-economic status, race, and profession) plus an "others" class, enabling a deeper study of their relationship. Our experiments compare encoder-only models and decoder-only models fine-tuned with QLoRA. While encoder-only models perform well, decoder-only models also show competitive results. Crucially, joint training on bias and stereotype detection significantly improves bias detection compared to training the tasks separately. Additional experiments with sentiment analysis confirm that the improvements stem from the connection between bias and stereotypes, not from multi-task learning alone. These findings highlight the value of leveraging stereotype information to build fairer and more effective AI systems.
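A minimal sketch of the joint-training idea, assuming a shared encoder with separate bias and stereotype heads trained on the sum of both losses; dimensions and label counts are hypothetical, not the StereoBias configuration:

```python
# Minimal sketch: shared encoder, two task heads, joint loss. All sizes hypothetical.
import torch
import torch.nn as nn

class JointBiasStereotype(nn.Module):
    def __init__(self, hidden=768, n_bias=2, n_stereo=6):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU())  # stand-in encoder
        self.bias_head = nn.Linear(hidden, n_bias)      # biased / not biased
        self.stereo_head = nn.Linear(hidden, n_stereo)  # stereotype category

    def forward(self, x):
        h = self.encoder(x)                             # shared representation
        return self.bias_head(h), self.stereo_head(h)

model = JointBiasStereotype()
x = torch.randn(4, 768)                                 # e.g. pooled sentence embeddings
bias_y = torch.randint(0, 2, (4,))
stereo_y = torch.randint(0, 6, (4,))
bias_logits, stereo_logits = model(x)
loss = nn.functional.cross_entropy(bias_logits, bias_y) \
     + nn.functional.cross_entropy(stereo_logits, stereo_y)  # joint objective
loss.backward()
```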
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > United States > Washington > King County > Seattle (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (10 more...)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.68)
Perception-Driven Bias Detection in Machine Learning via Crowdsourced Visual Judgment
Tupakula, Chirudeep, Shamsuddin, Rittika
Machine learning systems are increasingly deployed in high-stakes domains, yet they remain vulnerable to bias: systematic disparities that disproportionately impact specific demographic groups. Traditional bias detection methods often depend on access to sensitive labels or rely on rigid fairness metrics, limiting their applicability in real-world settings. This paper introduces a novel, perception-driven framework for bias detection that leverages crowdsourced human judgment. Inspired by reCAPTCHA and other crowd-powered systems, we present a lightweight web platform that displays stripped-down visualizations of numeric data (for example, salary distributions across demographic clusters) and collects binary judgments on group similarity. We explore how users' visual perception, shaped by layout, spacing, and question phrasing, can signal potential disparities. User feedback is aggregated to flag data segments as biased, and flagged segments are then validated through statistical tests and machine learning cross-evaluations. Our findings show that perceptual signals from non-expert users reliably correlate with known bias cases, suggesting that visual intuition can serve as a powerful, scalable proxy for fairness auditing. This approach offers a label-efficient, interpretable alternative to conventional fairness diagnostics, paving the way toward human-aligned, crowdsourced bias detection pipelines.
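A minimal sketch of the aggregate-then-validate loop described above, with hypothetical vote data and thresholds, and a Mann-Whitney U test standing in for the paper's unspecified statistical validation:

```python
# Minimal sketch: aggregate crowd "do these groups look similar?" votes, then
# validate flagged segments against the raw data. All values hypothetical.
import numpy as np
from scipy.stats import mannwhitneyu

votes = [0, 0, 1, 0, 0, 0, 1, 0]            # 1 = "groups look similar", per worker
flagged = np.mean(votes) < 0.5              # majority perceived a disparity

if flagged:
    # Check perception against the underlying numbers (e.g. salaries in $1000s).
    group_a = np.array([52, 55, 49, 61, 58], dtype=float)
    group_b = np.array([41, 44, 47, 39, 45], dtype=float)
    stat, p = mannwhitneyu(group_a, group_b, alternative="two-sided")
    print("perceived bias confirmed" if p < 0.05 else "not confirmed", p)
```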
- North America > United States > Oklahoma > Payne County > Stillwater (0.04)
- North America > United States > California (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.94)
BiasGuard: A Reasoning-enhanced Bias Detection Tool For Large Language Models
Fan, Zhiting, Chen, Ruizhe, Liu, Zuozhu
Identifying bias in LLM-generated content is a crucial prerequisite for ensuring fairness in LLMs. Existing methods, such as fairness classifiers and LLM-based judges, struggle to grasp underlying intentions and lack clear criteria for fairness judgments. In this paper, we introduce BiasGuard, a novel bias detection tool that explicitly analyzes inputs and reasons through fairness specifications to provide accurate judgments. BiasGuard is implemented through a two-stage approach: the first stage initializes the model to reason explicitly over fairness specifications, while the second stage leverages reinforcement learning to enhance its reasoning and judgment capabilities. Our experiments, conducted across five datasets, demonstrate that BiasGuard outperforms existing tools, improving accuracy and reducing over-fairness misjudgments. We also highlight the importance of reasoning-enhanced decision-making and provide evidence for the effectiveness of our two-stage optimization pipeline.
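The abstract does not specify the stage-two reward; a purely hypothetical sketch of what a rule-based reinforcement-learning reward could look like, scoring correct judgments and the presence of explicit reasoning (the output format is assumed, not BiasGuard's):

```python
# Hypothetical reward for the RL stage; format and weights are assumptions,
# not the actual BiasGuard design.
import re

def reward(output: str, gold: str) -> float:
    has_reasoning = "<think>" in output and "</think>" in output  # assumed format
    m = re.search(r"judgment:\s*(biased|unbiased)", output.lower())
    correct = bool(m) and m.group(1) == gold
    return (1.0 if correct else -1.0) + (0.2 if has_reasoning else -0.2)

print(reward("<think>the text generalizes about a group...</think> judgment: biased",
             "biased"))  # 1.2: correct judgment with explicit reasoning
```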
Bias Detection via Signaling
We introduce and study the problem of detecting whether an agent updates their prior beliefs optimally (i.e., in a Bayesian way) given new evidence, or whether they are biased towards their own prior. In our model, biased agents form posterior beliefs that are a convex combination of their prior and the Bayesian posterior; the more biased an agent is, the closer their posterior is to the prior. Since we often cannot observe the agent's beliefs directly, we take an approach inspired by information design. Specifically, we measure an agent's bias by designing a signaling scheme and observing the actions they take in response to different signals, assuming that they maximize their own expected utility; our goal is to detect bias with a minimum number of signals. Our main results include a characterization of scenarios where a single signal suffices and a computationally efficient algorithm to compute optimal signaling schemes.
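A worked numeric example of the bias model: a λ-biased agent's posterior is the convex combination λ·prior + (1 − λ)·Bayes(prior, signal), so λ = 0 recovers the fully Bayesian agent and λ = 1 an agent who ignores evidence (numbers are illustrative):

```python
# Worked example of the convex-combination bias model from the abstract.
import numpy as np

prior = np.array([0.5, 0.5])           # two states of the world
likelihood = np.array([0.9, 0.2])      # P(signal = s | state), per state

def biased_posterior(lam: float) -> np.ndarray:
    bayes = prior * likelihood
    bayes /= bayes.sum()               # Bayesian update on observing s
    return lam * prior + (1 - lam) * bayes  # pull back toward the prior

print(biased_posterior(0.0))  # [0.818, 0.182]: fully Bayesian
print(biased_posterior(0.5))  # [0.659, 0.341]: halfway back toward the prior
```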
CUBIC: Concept Embeddings for Unsupervised Bias Identification using VLMs
Méndez, David, Bontempo, Gianpaolo, Ficarra, Elisa, Confalonieri, Roberto, Díaz-Rodríguez, Natalia
Deep vision models often rely on biases learned from spurious correlations in datasets. To identify these biases, methods that interpret high-level, human-understandable concepts are more effective than those relying primarily on low-level features like heatmaps. A major challenge for these concept-based methods is the lack of image annotations indicating potentially bias-inducing concepts, since creating such annotations requires detailed labeling for each dataset and concept, which is highly labor-intensive. We present CUBIC (Concept embeddings for Unsupervised Bias IdentifiCation), a novel method that automatically discovers interpretable concepts that may bias classifier behavior. Unlike existing approaches, CUBIC does not rely on predefined bias candidates or examples of model failures tied to specific biases, as such information is not always available. Instead, it leverages the image-text latent space and linear classifier probes to examine how the latent representation of a superclass label, shared by all instances in the dataset, is influenced by the presence of a given concept. By measuring these shifts against the normal vector to the classifier's decision boundary, CUBIC identifies concepts that significantly influence model predictions. Our experiments demonstrate that CUBIC effectively uncovers previously unknown biases using Vision-Language Models (VLMs), without requiring samples on which the classifier underperforms or prior knowledge of potential biases.
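A minimal sketch of the underlying geometry, with random vectors standing in for real VLM embeddings and probe weights: a concept is scored by the signed shift of the superclass embedding along the unit normal of the probe's decision boundary:

```python
# Geometry sketch only: random vectors stand in for real CLIP-style embeddings
# and a trained linear probe. Prompts in comments are hypothetical.
import numpy as np

rng = np.random.default_rng(0)
d = 512
z_super = rng.normal(size=d)      # e.g. embed("a photo of a bird")
z_concept = rng.normal(size=d)    # e.g. embed("a photo of a bird in water")
w = rng.normal(size=d)            # linear-probe weights (boundary normal)

n = w / np.linalg.norm(w)         # unit normal to the decision boundary
shift = (z_concept - z_super) @ n # signed movement toward/away from the class
print("concept influence score:", shift)  # large |shift| -> candidate bias concept
```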
- Europe > Switzerland > Zürich > Zürich (0.14)
- South America > Argentina > Patagonia > Río Negro Province > Viedma (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
- (2 more...)
- Research Report > New Finding (0.46)
- Research Report > Promising Solution (0.34)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.46)